Evaluation of data mining techniques for suspicious network activity classification using honeypots data
نویسندگان
چکیده
As the amount and types of remote network services increase, the analysis of their logs has become a very difficult and time consuming task. There are several ways to filter relevant information and provide a reduced log set for analysis, such as whitelisting and intrusion detection tools, but all of them require too much finetuning work and human expertise. Nowadays, researchers are evaluating data mining approaches for intrusion detection in network logs, using techniques such as genetic algorithms, neural networks, clustering algorithms, etc. Some of those techniques yield good results, yet requiring a very large number of attributes gathered by network traffic to detect useful information. In this work we apply and evaluate some data mining techniques (KNearest Neighbors, Artificial Neural Networks and Decision Trees) in a reduced number of attributes on some log data sets acquired from a real network and a honeypot, in order to classify traffic logs as normal or suspicious. The results obtained allow us to identify unlabeled logs and to describe which attributes were used for the decision. This approach provides a very reduced amount of logs to the network administrator, improving the analysis task and aiding in discovering new kinds of attacks against their networks.
منابع مشابه
Detection of Breast Cancer Progress Using Adaptive Nero Fuzzy Inference System and Data Mining Techniques
Prediction, diagnosis, recovery and recurrence of the breast cancer among the patients are always one of the most important challenges for explorers and scientists. Nowadays by using of the bioinformatics sciences, these challenges can be eliminated by using of the previous information of patients records. In this paper has been used adaptive nero fuzzy inference system and data mining techniqu...
متن کاملIdentification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...
متن کاملIdentification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...
متن کاملCredit scoring in banks and financial institutions via data mining techniques: A literature review
This paper presents a comprehensive review of the works done, during the 2000–2012, in the application of data mining techniques in Credit scoring. Yet there isn’t any literature in the field of data mining applications in credit scoring. Using a novel research approach, this paper investigates academic and systematic literature review and includes all of the journals in the Science direct onli...
متن کاملUsing Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach
Heart disease is one of the major causes of morbidity in the world. Currently, large proportions of healthcare data are not processed properly, thus, failing to be effectively used for decision making purposes. The risk of heart disease may be predicted via investigation of heart disease risk factors coupled with data mining knowledge. This paper presents a model developed using combined descri...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007